Automatic Speech Segmentation with HMM

Authors

John Dines

Sridha Sridharan

Abstract

In this paper we review aspects of our automatic speech segmentation system, which has been used in conjunction with our speech synthesis research. The system is based on a hidden Markov model (HMM) phone recogniser that uses training strategies optimised for the segmentation task. Our research analyses the various aspects of the phone recogniser's design and identifies the distinctions between paradigms of parameter estimation for speech segmentation and for speech recognition. We also look at the limitations of HMM-based segmentation and at techniques for overcoming them. The evaluation demonstrates that our system provides high-reliability speech segmentation, with performance comparable to other state-of-the-art systems.

Related resources

Automatic Segmentation Combining and Spectral Boundary

Currently, AT&T Labs’ Natural Voices multilingual TTS system produces high-quality synthetic speech with a large-scale speech corpus [1]. In the development of such systems, automatic segmentation constitutes a major component technology. The prevalent approach for automatic segmentation in speech synthesis is Hidden Markov Model (HMM) based. Even though an HMM-based approach is the most automat...

Automatic Speech Segmentation Based on HMM

This contribution deals with the problem of automatic phoneme segmentation using HMMs. Automating the speech segmentation task is important for applications where a large amount of data must be processed, so manual segmentation is out of the question. In this paper we focus on automatic segmentation of recordings that will be used to create a triphone synthesis unit database. For speech...

HMM-based automatic visual speech segmentation using facial data

We describe automatic visual speech segmentation using facial data captured by a stereo-vision technique. The segmentation is performed using an HMM-based forced alignment mechanism widely used in automatic speech recognition. The idea is based on the assumption that using visual speech data alone for the training might capture the uniqueness in the facial component of speech articulation, asyn...

A study of HMM-based automatic segmentations for Thai continuous speech recognition system

Speech segmentation has been widely used in many speech applications. In speech synthesis, the quality of the produced speech depends on the accuracy of the labeled acoustic inventory. In speech recognition, utterances segmented according to the labels are usually used as a starting point for training speech models. The segmentation is often manually encoded, which is a time-consuming process and has ...

A Sphinx Based Speech-music Segmentation Front-end for Improving the Performance of an Automatic Speech Recognition System in Turkish

In this study, a system that segments an audio signal into speech and music using posterior-probability-based features is proposed and implemented in Sphinx. Unlike earlier efforts that use Multi-Layer Perceptrons (MLP), this system uses Hidden Markov Model based acoustic models that are trained in Sphinx for posterior probability calculations. Acoustic Models are trained with the HMM-state...

Publication date: 2002